[57] ABSTRACT

The data storage management system provides the capability to move and/or copy the placeholder files from one file server volume to another file server volume, even if the destination file server volume resides on a different file server. This is accomplished by the use of unique and immutable migration keys which are included in the placeholder entries to identify the location of the associated data file with absolute certainty. In addition, a duplicate copy of the placeholder catalog file is maintained in the system to prevent loss of file system integrity in the event that the active placeholder catalog file is corrupted. This placeholder data is maintained in a placeholder volume catalog which provides users with a file-system-like view of the placeholder files which reside on the selected volume.

9 Claims, 6 Drawing Sheets 
DATA STORAGE MANAGEMENT FOR NETWORK INTERCONNECTED PROCESSORS USING TRANSFERRABLE PLACEHOLDERS

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation-in-part of a patent application titled "Data Storage Management For Network Interconnected Processors," Ser. No. 08/650,114, filed May 22, 1996, U.S. Pat. No. 5,832,522 which is a divisional of a patent application titled "Data Storage Management For Network Interconnected Processors," Ser. No. 08/201,658, filed Feb. 25, 1994 and issued as U.S. Pat. No. 5,537,585.

FIELD OF THE INVENTION

This invention relates to data communication networks, such as local area networks, that function to interconnect a plurality of data processors with data storage subsystems, and to a data storage management system that automatically migrates low priority data files from the data storage subsystems to backend data storage to provide more available data storage space in the data storage subsystems, with the ability to transfer ownership of these files among the data processors.

PROBLEM

It is a problem in the field of local area networks to provide both adequate data storage resources for the processors connected to the network as well as efficient data storage management capability associated with the data storage subsystems that are connected to the network and which serve the processors. The data storage system described in U.S. Pat. No. 5,537,585 solves this problem by providing a hierarchical data storage capability to migrate lower priority data files from the data storage subsystems that are connected to the network to backend less expensive data storage media, such as optical disks or magnetic tape.

This data storage management system implements a virtual data storage system, comprising a plurality of virtual file systems, for the processors that are connected to the network. The virtual data storage system consists of a first section that comprises a plurality of data storage subsystems, each consisting of file servers and their associated data storage devices, which are connected to the network and serve the processors. A second section of the virtual data storage system comprises the storage server, consisting of a storage server processor and at least one layer of hierarchically arranged data storage devices, that provides backend data storage space. The storage server processor interfaces to software components stored in each processor and file server that is connected to the network. The storage server, on a demand basis and/or on a periodically scheduled basis, audits the activity on each volume of each data storage device that is connected to the network. Data files that are of lower priority are migrated via the network and the storage server to backend data storage media. The data file directory resident in the data storage device that originally contained this data file is updated with a placeholder entry in the directory to indicate that this data file has been migrated to backend data storage. Therefore, when a processor requests this data file, the placeholder entry is retrieved from the directory and the storage server is notified that the requested data file has been migrated to backend storage and must be recalled to the data storage device from which it originated. The storage server automatically retrieves the requested data file using information stored in the placeholder entry and transmits the retrieved data file to the data storage device from whence it originally came. The storage server, backend data storage and processor resident software modules create a virtual storage capacity for each of the data storage devices in a manner that is transparent to both the processor and the user. Each virtual volume in this system can be expanded in extent in a seamless manner to match the needs of the processor by using low cost mass storage devices.

A difficulty with the placeholder concept described in this reference is that a data file, which is denoted by a placeholder, cannot be relocated from one data storage volume to another data storage volume and still retain the ability to be recalled by the data storage management system. In addition, the placeholder catalog files are stored on the file servers associated with the processors which are connected to the network. These placeholder catalog files are therefore potentially exposed to corruption by the users or by other programs extant on the processors. Damage to a placeholder catalog file can cause the data storage management system to be unable to recall data files. Thus, the lack of protection of placeholder catalog files and the portability of placeholders among the data storage volumes are limitations of this data storage management system.

SOLUTION

The above-described problems are solved and a technical advance achieved in the field by the data storage management system of the present invention. The data storage management system provides the capability to move and/or copy the data files from one file server volume to another file server volume, even if the destination file server volume resides on a different file server. This is accomplished by the use of unique and immutable migration keys which are included in the placeholder entries to identify the location of the associated data file with absolute certainty. Since the system users and administrators can rename the storage server, the unique migration key includes timestamp data which relates the storage server name with a specific point in time, which enables the data storage management system to accurately track the present identity of the storage server. In addition, a duplicate copy of the placeholder catalog file is maintained in the data storage management system to prevent loss of file system integrity in the event that the active placeholder catalog file is corrupted. This placeholder data is maintained in a placeholder volume catalog which provides users with a file-system-like view of the placeholder files which reside on the selected volume. The placeholder volume catalogs are not used for file access purposes, but, instead, are used to ensure that placeholder restoration can be accomplished in the event that the placeholders are corrupted and to provide the users with data volume content viewing. Each placeholder volume catalog resides on the file server to which it pertains and tracks the placeholders for all HSM servers. In this manner, the integrity of the placeholders is ensured.

The data storage management system therefore enables ownership of data files to be transferred among the file servers with the continued use of the placeholder paradigm. This movable placeholder concept is enhanced by the use of placeholder volume catalogs which prevent the loss of placeholder data.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 illustrates in block diagram form the overall architecture of a typical local area network that includes the data storage management system of the present invention;
FIG. 2 illustrates in table form the architecture and content of a typical placeholder;

FIG. 3 illustrates in conceptual view the architecture of the hierarchical memory of the data storage management system of the present invention;

FIG. 4 illustrates a physical implementation of the hierarchical memory of the data storage management system of the present invention;

FIG. 5 illustrates in block diagram form the data file migration and backup paths taken in the data storage management system;

FIG. 6 illustrates in block diagram form various components of the hierarchical storage manager software;

FIG. 7 illustrates in block diagram form the client-server view of the system of the present invention;

FIG. 8 illustrates in flow diagram form the file recall operation; and

FIG. 9 illustrates a typical directory structure used by a file system.

DETAILED DESCRIPTION

Local area networks are increasingly becoming an integral feature in the business environment. FIG. 1 illustrates in block diagram form the overall architecture of a typical local area network 1 and the incorporation of the data storage management system of the present invention into the local area network 1. A local area network 1 consists of data communication link 11 and software (not shown) that interconnects a plurality of processors 21, 22 with a number of file servers 41-43. The processors 21, 22 can be personal computers, work stations, mini-computers or any other processing element. For the simplicity of description, all of these devices are described by the generic term "processor".

While many of these processors 21, 22 may contain a significant amount of data storage capacity, it is not uncommon for a local area network 1 to be equipped with additional data storage capacity to supplement that of the processors 21, 22 themselves. The data storage devices 31-33 that are connected to the data communication link 11 of the local area network 1 are typically high-speed random access devices, such as high capacity disk drives or even disk drive arrays, to thereby substantially be compatible with the operating speed of the processors 21, 22 and the data communication link 11. Each data storage device 31-33 is included in a file server 41, work station 42 or other type of server 43, which functions as an interface between the network 1 and the data storage device 31-33, such as a disk drive. For simplicity of description, the data storage capacity provided by the file server 41-43 and its associated data storage device 31-33 is referred to as "file server" herein.

Each processor 21 that is connected to the local area network 1 is typically capable of accessing at least one data storage volume on one of these file servers 41 as directly accessible additional data storage space for the use of this processor 21 to store data files. The term data files is used to characterize the various data that can be stored on data storage devices and includes data managed by file servers, databases, application servers, and note systems, which are collectively referred to as "file servers" herein. In this system, the local area network 1 provides a communication fabric over which processors 21, 22 and the file servers 41-43 communicate via a predetermined protocol. The disclosed configuration and implementation of the local area network 1 and its protocol, processors 21, 22, file servers 41-43 as described herein are simply illustrative of the invention and there are numerous alternate embodiments of this system that are possible.

In addition to the processors 21, 22 and the file servers 41-43, the data storage management system 50 is connected to the local area network 1. The data storage management system 50 includes storage server processor 51 which serves to interface the local area network 1 with the backend data storage devices 61-64 (FIG. 4) that constitute the secondary storage 52. The backend data storage devices 61-64, in combination with the file servers 41-43 comprise a hierarchical data storage system. The backend data storage devices 61-64 typically include at least one layer of data storage that is less expensive than the dedicated data storage devices 31-33 of the file servers 41-43, to provide a more cost-effective data storage capacity for the processors 21, 22. The data storage management system implements a virtual data storage space for the processors 21, 22 that are connected to the local area network 1. The virtual data storage space consists of a first section A that comprises a primary data storage device 31 that is connected to the network 1 and used by processors 21, 22. A second section B of the virtual memory comprises the secondary storage 52 managed by the storage server processor 51. The secondary storage 52 provides additional data storage capacity for each of the primary data storage devices 31-33, represented on FIG. 1 as the virtual devices 31S-33S attached in phantom to the primary data storage devices 31-33 of the file servers 41-43. Processor 21 is thereby presented with the image of a greater capacity data storage device 31 than is presently connected to the file server 41. The storage server 51 interfaces to software components stored in each processor 21, 22 and file server 41-43 that is connected to the local area network 1.

The storage server 51, on a demand basis and/or on a periodically scheduled basis, audits the activity on each volume of each data storage device 31-33 of the file servers 41-43 that are connected to the network 1. Data files that are of lower priority are migrated via the network 1 and the storage server 51 to backend data storage media of the secondary storage 52. The data file directory resident in the file server 41 that originally contained this data file is updated with a placeholder entry to indicate that this data file has been migrated to backend data storage. Therefore, when the processor 21 requests this data file, the placeholder entry is retrieved from the directory and the storage server processor 51 is notified that the requested data file has been migrated to backend data storage media and must be recalled to the file server 41 from which it originated. In the case of a processor 21, 22 and 42 that interfaces to a user, the storage server 50 may provide the user with a notification, where necessary, that a time delay may be noted in accessing the requested data file. The storage server processor 51 automatically retrieves the requested data file from backend data storage and transmits it to the data storage device 31 from whence it originally came. The storage server processor 51, secondary storage 52 and processor resident software modules create a virtual storage capacity for each of the file servers 41-43 in a manner that is transparent to both the processor 21, 22 and the user. Each virtual volume in this system can be expanded in extent in a seamless manner to match the needs of the processors 21, 22 by using low cost mass storage devices to implement the secondary storage 52.

Hierarchical Storage Management Architecture

FIG. 3 illustrates the philosophical architecture and FIG. 4 illustrates one possible hardware implementation of the hierarchical data storage management (HSM) system. The user at a processor 21 interfaces with a primary data storage device P via the network 1. The primary storage device P consists of a file server 41 and its associated data storage device(s) 31, such as a disk drive. The file server 41 manages the data storage media of the associated data storage device 31 in well known fashion. The data storage device 31 is typically divided into a number of volumes, which can be called file server volumes.

As illustrated in FIG. 3, the secondary storage 52 is divided into at least one and more likely a plurality of layers 311-313, generally as a function of the media used to implement the data storage devices 61-64. In particular, the second layer 311 of the hierarchical data storage, which is the first layer of the secondary storage 52, can be implemented by high speed magnetic storage devices 61. Such devices include disk drives and disk drive arrays. The third layer 312 of the hierarchical data storage, which is the second layer of the secondary storage 52, can be implemented by optical storage devices 62. Such devices include optical disk drives and robotic media storage and retrieval library systems. The fourth layer 313 of the hierarchical data storage, which is the third layer of the secondary storage 52, can be implemented by slow speed magnetic storage devices 63. Such devices include magnetic tape drives and robotic media storage and retrieval library systems. An additional layer 314 of the hierarchical data storage can be implemented by the use of a "shelf layer", which can be implemented by manual storage of media. This disclosed hierarchy is simply illustrative of the data storage management concept and the number, order and implementation of the various layers can differ from that disclosed herein.

As can be seen in FIG. 3, data files can migrate from the file server volumes of the first section A of the virtual memory to the data storage devices 61-64 of the second section B of the virtual memory. In addition, these data files can further be relocated from the first layer 311 of the secondary storage 52 to the second 312 and third layers 313 of the secondary storage 52 as a function of the activity of the data file, as indicated in FIG. 3. Further, the data file can be recalled directly to the file server volumes from any layer of the secondary storage 52.
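
The layer-to-layer movement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Layer` names, the `StoredFile` record, and the idle-time thresholds in `DEMOTE_AFTER_DAYS` are hypothetical and do not come from the disclosure, which says only that relocation is "a function of the activity of the data file" and that recall can occur directly from any layer.

```python
from dataclasses import dataclass
from enum import IntEnum


class Layer(IntEnum):
    """Layers of the hierarchy; reference numerals from the text are in comments."""
    PRIMARY = 1         # file server volumes, section A
    FAST_MAGNETIC = 2   # layer 311: disk drives and disk drive arrays
    OPTICAL = 3         # layer 312: optical drives and libraries
    TAPE = 4            # layer 313: magnetic tape
    SHELF = 5           # layer 314: manually stored media


# Hypothetical idle-time thresholds (days) after which a file moves down one layer.
DEMOTE_AFTER_DAYS = {
    Layer.PRIMARY: 30,
    Layer.FAST_MAGNETIC: 90,
    Layer.OPTICAL: 365,
    Layer.TAPE: 1825,
}


@dataclass
class StoredFile:
    name: str
    layer: Layer
    idle_days: int


def relocate_if_idle(f: StoredFile) -> Layer:
    """Move the file down one layer when it has been idle long enough."""
    limit = DEMOTE_AFTER_DAYS.get(f.layer)  # SHELF has no entry: nowhere lower to go
    if limit is not None and f.idle_days >= limit:
        f.layer = Layer(f.layer + 1)
    return f.layer


def recall(f: StoredFile) -> Layer:
    """Recall returns the file straight to the file server volume from any layer."""
    f.layer = Layer.PRIMARY
    f.idle_days = 0
    return f.layer
```

Note that demotion steps one layer at a time while recall skips the intermediate layers, which mirrors the asymmetry in the paragraph above.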

Data Management System Software 

The data management system software of the present invention manages the flow of data files throughout the HSM system. The block diagram of FIG. 7 illustrates a conceptual client-server view of the network and the data management system software. The data communication link 11 of the local area network 1 is illustrated having the storage server processor 51 and three file systems 41-43 attached thereto. The storage server processor 51 includes the network operating system 111 as well as the data storage management system software consisting of various media and device management user interfaces 112 and control and services software 113. Each file server 41-43 includes a storage server agent 121-123 and any processor of the network can include and run an administrative user interface 131. The control and services software 113 looks at the HSM system as a set of clients that are connected to the network 1 and which require services from the storage server 50. Each file server 41-43 communicates with the storage server processor 51 via the resident storage server agent software 121-123. Thus, the data management system software is distributed throughout the network and serves to transparently integrate all the elements connected to the network into the data storage hierarchy.

The storage server agent 121-123 represents a component that is installed in each file server 41-43 in the local area network 1 and functions to redirect requests for migrated data files from the file server 41-43 which was the original repository of the requested data file to the storage server 50. The storage server agent 121-123 provides whatever interfaces are required to redirect data file access from the file server 41-43 to the storage server processor 51 and secondary storage 52. In the case of a processor 21, 22, 42 that interfaces to a user, the storage server 50 may provide the user with a notification that a time delay may be noted in accessing the requested data file. Thus, the storage server agent 121-123 has a personality that is tailored to the underlying client file server platform or environment. For example, where the file server is a database management server, the storage server agent interfaces with the database management system object manager to allow automatic migration and recall of database objects, which can be viewed as sub-files. Another example is the NetWare® file system access manager which traps any NetWare® supported file system calls at the file server. This also allows the automatic recall of migrated data files to be triggered.
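
The redirect behavior of the agent can be illustrated with a small Python sketch. The class name, its `read` method, the `migrated` set, and the injected `recall` callable are all hypothetical stand-ins; the disclosure does not specify an interface, and a real agent would hook the platform's file system calls rather than expose a method like this.

```python
from typing import Callable, Dict, List, Set


class StorageServerAgent:
    """Hypothetical agent that traps reads on one file server."""

    def __init__(self, volume: Dict[str, bytes], migrated: Set[str],
                 recall: Callable[[str], bytes]) -> None:
        self.volume = volume          # path -> data held on the file server volume
        self.migrated = migrated      # paths whose data now lives in secondary storage
        self.recall = recall          # stands in for the request to the storage server
        self.notices: List[str] = []  # user-visible delay notifications

    def read(self, path: str) -> bytes:
        if path in self.migrated:
            # Redirect to the storage server and warn the user about the delay.
            self.notices.append(f"{path}: recalling from secondary storage, expect a delay")
            self.volume[path] = self.recall(path)
            self.migrated.discard(path)
        # Local files, and files that have just been recalled, are served directly.
        return self.volume[path]
```

Passing `recall` in as a callable keeps the sketch independent of any particular transport between the file server and the storage server.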

Using these basic elements, numerous variations of the local area network 1 can be configured, having multiple processors 21, 22 and multiple file servers 41-43, each with their attached data storage devices 31-33. The processor 51 on which the storage server software runs includes a physical interface to the data communication link 11 of the local area network 1.

Routine Sweep Operation 

FIG. 5 illustrates in block diagram form the data file migration and backup paths taken in the data storage management system. The device manager 504 of storage server 50 is activated by operations kernel 501 and sweeps the migration candidates from the selected managed network volume, transmits and assembles them into a transfer unit (as described below) within the top layer 311 in the secondary storage 52. The migration candidate data file is selected by the operations kernel 501 and removed from the managed volume of data storage device 31, after transmitting the data file via network interface 503, the data communication link 11 of network 1 and network interface 502 to the storage server 50 and checking that the data file has been transferred correctly. Storage server 50 thus writes the transfer unit containing the transferred data file and other data files to level 1 (311) of the secondary storage 52.
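
The ordering of steps in the sweep (transmit, verify, and only then remove) can be sketched as follows. This is a minimal illustration, not the patented mechanism: the function name `sweep`, the in-memory dictionaries, and the use of a SHA-256 digest as the correctness check are assumptions, since the text says only that the transfer is checked before the file is removed.

```python
import hashlib
from typing import Dict, List, Tuple

# A transfer unit is modeled here as an ordered list of (file name, data) pairs.
TransferUnit = List[Tuple[str, bytes]]


def sweep(managed_volume: Dict[str, bytes], candidates: List[str],
          transfer_unit: TransferUnit) -> List[str]:
    """Copy each candidate into the transfer unit, verify it, then remove the original."""
    migrated = []
    for name in candidates:
        data = managed_volume[name]
        digest_before = hashlib.sha256(data).hexdigest()
        transfer_unit.append((name, bytes(data)))   # stands in for the network transfer
        digest_after = hashlib.sha256(transfer_unit[-1][1]).hexdigest()
        if digest_after != digest_before:
            transfer_unit.pop()                     # keep the original if the copy is bad
            continue
        del managed_volume[name]                    # remove only after verification
        migrated.append(name)
    return migrated
```

The sketch stops at removal of the data; updating the directory of the managed volume is a separate step and is not modeled here.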

The data file is listed in the data file directory 511 of the network volume on which the processor 21 has written the data file. This directory listing is modified by the operations kernel 501 to enable the processor 21 to obtain the data file whether it is stored on the managed volume in the network volume or on a volume in the secondary storage 52. This is accomplished by the operations kernel 501 providing a "placeholder entry" in the data file directory 511 of the managed volume. This entry lists the data file as having an extent of "0" and data is provided in the directory attributes or metadata area for the data file that points to the catalog entry, created by systems services 505, in the secondary storage directory 531 that lists the relative storage location in the secondary storage 52 that contains the migrated data file. The directory of the relative location of a particular data file in secondary storage 52 is maintained in the network volume itself. This is accomplished by the use of a secondary storage directory 531 that is maintained in file server 41 by the operations kernel 501 and systems services 505 of storage server 50. The data file directory 511 and secondary storage directory 531 can both be written on the data storage device 31 of file server 41.
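
The two-level lookup just described (a placeholder entry in the data file directory pointing to a catalog entry in the secondary storage directory) can be sketched with two small records. The type and function names are hypothetical, the key is shown as an opaque string, and the shape of `RelativeLocation` (transfer unit identification plus position within the unit) is an assumption about what the "relative storage location" contains.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DirectoryEntry:
    """One entry in the data file directory of a managed volume."""
    name: str
    extent: int                                   # size on the managed volume
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class RelativeLocation:
    """Assumed shape of a catalog entry in the secondary storage directory."""
    transfer_unit_id: int
    position: int


def make_placeholder(entry: DirectoryEntry, migration_key: str,
                     location: RelativeLocation,
                     secondary_directory: Dict[str, RelativeLocation]) -> None:
    """Turn a directory entry into a placeholder for a migrated data file."""
    secondary_directory[migration_key] = location      # catalog entry
    entry.extent = 0                                   # placeholder shows extent "0"
    entry.attributes["migration_key"] = migration_key  # pointer kept in the attributes


def locate(entry: DirectoryEntry,
           secondary_directory: Dict[str, RelativeLocation]) -> Optional[RelativeLocation]:
    """Follow the pointer from a placeholder to its catalog entry, if there is one."""
    key = entry.attributes.get("migration_key")
    if key is None:
        return None            # an ordinary, non-migrated data file
    return secondary_directory[key]
```

Because the pointer is stored in `attributes` rather than in `name`, changing `entry.name` in this sketch would leave the lookup intact.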
The use of a migration key or pointer in the placeholder entry to indicate the secondary storage directory entry for the requested data file is preferably accomplished by storing the migration key as part of the data file attributes in the file system. This enables both the placeholder entry and the secondary storage directory to survive data file renaming activity on the part of the requesting processor. File systems commonly rename data files and if the migration key were part of the file name it would be lost in the renaming activity. However, data file attributes are preserved as part of a data file renaming procedure. When a data file rename occurs, the name ascribed to this data file is modified and the entry in the network directory is suddenly placed in a different part of the file system primary storage directory. The data file attributes are transported in unmodified form with the new data file name and, since the placeholder entry, with its migration key, is part of the data file attributes, the newly renamed data file attributes still point to the correct secondary storage directory entry. Thus, the virtual segment of the file system automatically tracks the renaming of the data files in the primary segment of the file system.

The migrated data file is received by the storage server 50 and written at a selected available data storage space in a migration volume of a data storage device 61 in level one 311 of the secondary storage 52. In addition, if shadow volumes 64 are provided in the secondary storage 52 for data reliability purposes, the migrated data file is also written into selected available data storage space on one of the shadow volumes 64. Groups of data files stored on the shadow volumes 64 are also periodically dumped after a period of sweep activity has occurred via a special backup drive 71 on to backup media element 72 to ensure disaster recovery capability. To more efficiently manage data files in the hierarchy, the operations kernel 501 can assemble a plurality of data files into a transfer unit of predetermined size for continued migration to lower levels in the hierarchy. A candidate size for the transfer unit is a standard object size for the media that is used to implement the first layer 311 of the secondary storage 52. It is desirable that the transfer units that are used in the secondary storage 52 fit into all media with minimum boundary fragmentation.

The data files that are written to the migration volumes 61 and shadow volumes 64 have their relative storage location identification written into a secondary storage directory owned by the storage server 50. This secondary storage directory can be implemented entirely within the storage server 50, but would take up a great deal of data storage space and be difficult to protect. Instead, this secondary storage directory is distributed in the form of directory segments among the file servers 41-43 that contain managed volumes 31-33 for the processors 21, 22, with each directory segment representing the secondary storage directory 531 for the managed volume on the primary data storage device 31-33. The placeholder entry is contained in the secondary storage directory 531. Thus, the processor 21 that requests access to this migrated data file can obtain the requested data file without being aware of the existence of the secondary storage 52. This is accomplished (as described in detail below) by the storage service agent 121, which obtains placeholder entry from the data file directory 511, which points to the directory segment in the secondary storage directory 531. This identified directory segment in the secondary storage directory 531 contains the address in the migration volume that contains the requested data file.

Systems

The data management system makes use of a file system structure that provides a common repository for the potentially diverse file systems of the client file servers 41-43. The file system structure of the data management system must not only accept the data files from the file servers 41-43, but must also serve the backend data storage, data recall, data backup, data relocate and disaster recovery functions that are inherent in the data management system, wherein the media used for these functions can vary widely. The media can be an "update in place" media such as magnetic disk, or can have only "append" capabilities such as magnetic tape. The data file transfers are typically large in extent and must be such that data backup and data relocate operations can be performed in an efficient manner. Typical of file system architecture is a common DOS file system, whose organization is illustrated in FIG. 9. This file system has four basic components:

1. File naming convention.

2. Directory architecture, to organize data files by name so they can be easily located.

3. Physical space allocation scheme that relates data files to physical location on a data storage media, and which allows data storage space to be utilized and reclaimed when data files are deleted.

4. File management scheme, including access methods.

For example, DOS data files are named with a 1-8 byte name and a 0-3 byte extension, which are delimited by a "." (nnnnnnnn.xxx). The directory architecture is illustrated in FIG. 9 and takes the form of a hierarchical tree of directory names. The root is typically a volume, from which a number of directories branch. Each directory includes other directories and/or data files. A full data file name is represented by concatenating all the directory tree structure components from the root to the particular data file, with components being delimited by "\". An example of such a data file name using this convention is "\vol\dir1\dir3\filename.ext". Each DOS volume on the file server has its own unique file system. The physical space allocation on the data storage media is accomplished by the use of a File Allocation Table (FAT). The data storage space on a DOS volume is segmented into allocation units termed clusters. All directory and data file names in the volume are listed in the file allocation table and hierarchically related by linkages between parents and children in the tree. When a data file name is entered into the file allocation table, space is also provided for data file attributes such as hidden or read-only and the identification of the first cluster used to store the data file. If additional clusters are required to store this data file, these clusters are linked in a chain via pointers with the entire chain representing the physical location of the data file on the data storage media.

Transfer Units

The data management system 50 of this invention makes use of a different directory structure to manage the storage of data files on the data storage media of the secondary storage 52. The storage and relocation of data files among the various layers of the secondary storage 52 is simplified by the use of transfer units. A transfer unit represents a block of data of predetermined size which contains virtual file system objects (e.g. data files) that move together to the backup system and through the hierarchy, with each transfer unit being assigned a unique identification within the data management system.

As noted above, the operations kernel 501 of the storage processor 51 orders data files in each managed volume of the file systems 41-43 according to a predetermined algorithm. The ordering can be based on data file
usage, content, criticality, size or whatever other criteria are
selected. For the purpose of illustration, a simple least
recently used (LRU) ordering is described. The operations
kernel 501 orders the data files in each managed volume on
an LRU basis and the entries on the bottom of the list
represent migration candidates. The operations kernel 501
periodically sweeps the migration candidate data files from
the managed volumes and assembles them serially by managed
volume into a transfer unit containing a plurality of
data files. The full data file name is entered into the secondary
storage directory 531, together with relative data file
location information: the location of the data file within the
transfer unit, transfer unit identification, media object
identification. The data file name is always logically related to
the original transfer unit identification; the data file is never
moved to another transfer unit, but remains in the transfer
unit with the other temporally related data files from each
virtual file system at the time of migration to secondary
storage 52. The media object is itself associated with transfer
units, not data files. In this manner, one directory is used to
note the correspondence (relative storage location) between
data files and transfer unit and a second directory located in
storage server processor 51 is used to note the correspondence
(physical storage location) between transfer units and
media object. When transfer units are relocated from one
media to another, the data file directory need not be updated
since the data files remain in the original transfer unit and it
is simply the change in physical location of the transfer unit
on the media that must be noted.

Data File Recall

As illustrated in flow diagram form in FIG. 8 and with
reference to the system architecture in FIG. 5, a data file
recall operates in substantially the reverse direction of data
file migration. As noted above, the data files that are written
to the migration volumes 61 and shadow volumes 64 have
their relative storage location identification written into a
secondary storage directory 531 in the file server 41. The
placeholder entry in data file directory 511 on the file server
41 points to this secondary storage directory segment. Thus,
the processor 21 at step 801 requests access to this migrated
data file and this request is intercepted at step 802 by a trap
or interface 711 in the file server 41. The trap can utilize
hooks in the file system 41 to cause a branch in processing
to the storage server agent 121 or a call back routine can be
implemented that allows the storage server agent 121 to
register with the file system 41 and be called when the data
file request is received from the processor 21. In either case,
the trapped request is forwarded to storage server agent 121
to determine whether the requested data file is migrated to
secondary storage 52. This is accomplished by storage
server agent 121 at step 803 reading data file directory 511
to determine the location of the requested data file. If a
placeholder entry is not found stored in data file directory
511 at step 805, control is returned to the file server 41 at
step 806 to enable the file server 41 to read the directory
entry that is stored in data file directory 511 for the requested
data file. The data stored in this directory segment enables
the file server 41 to retrieve the requested data file from the
data storage device 31 on which the requested data file
resides.

If at step 805, storage server agent 121 determines, via the
presence of a placeholder entry, that the requested data file
has been migrated to secondary storage 52, storage server
agent 121 at step 807 creates a data file recall request and
transmits this request together with the direct access secondary
storage migration key stored in the placeholder entry
via network 1 to storage server 50. At step 808, operations
kernel 501 uses systems services 505 which uses the migration
key to directly retrieve the entry in secondary storage
directory 531. This identified entry in the secondary storage
directory 531 contains the relative storage location information
that consists of the transfer unit identification and
position of the data file in the transfer unit. The device
manager 504 uses the data file address information to recall
the requested data file from the data storage device on which
it is stored. This data storage device can be at any level in
the hierarchy, as a function of the activity level of the data
file.

Device manager 504 reads the data file from the relative
storage location identified in the secondary storage directory
531 and places the retrieved data file on the network 1 for
transmission to the file server 41 and volume 31 that
originally contained the requested data file. This is accomplished
by translating the relative storage location information
into an identification of the exact physical storage
location in secondary storage 52 that contains the requested
data file. The data file is then read out of this identified
physical storage location. Systems services 505 of operations
kernel 501 then updates the data file directory 511 to
indicate that the data file has been recalled to the network
volume. At step 811, control is returned to file server 41,
which reads data file directory 511 to locate the requested
data file. The data file directory 511 now contains information
that indicates the present location of this recalled data
file on data storage device 31. Processor 21 can then
directly access the recalled data file via the file server 41.
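The migration and recall paths described above rest on two levels of indirection: a directory that maps each migrated data file to a position inside a transfer unit, and a second directory that maps each transfer unit to the media holding it. The following is a minimal sketch of that idea in Python, not the patented implementation. The class and method names, the in-memory dictionaries, and the use of a simple counter as the migration key are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelativeLocation:
    """Secondary storage directory entry: position of a file in a transfer unit."""
    transfer_unit_id: int
    offset: int
    length: int


class StorageServer:
    """Toy stand-in for the storage server; every name here is hypothetical."""

    def __init__(self):
        self.secondary_directory = {}  # migration key -> RelativeLocation
        self.media_directory = {}      # transfer unit id -> media name
        self.media = {}                # media name -> {transfer unit id: bytes}
        self._next_unit = 1
        self._next_key = 1

    def migrate(self, files, media_name, count):
        """Sweep the `count` least recently used files into one transfer unit.

        `files` maps name -> (last_access, data). Returns name -> migration key.
        """
        candidates = sorted(files, key=lambda name: files[name][0])[:count]
        unit_id, self._next_unit = self._next_unit, self._next_unit + 1
        blob, keys = bytearray(), {}
        for name in candidates:
            data = files[name][1]
            key, self._next_key = self._next_key, self._next_key + 1
            self.secondary_directory[key] = RelativeLocation(unit_id, len(blob), len(data))
            blob += data
            keys[name] = key
        self.media.setdefault(media_name, {})[unit_id] = bytes(blob)
        self.media_directory[unit_id] = media_name
        return keys

    def relocate(self, unit_id, new_media):
        """Move a transfer unit to other media; only the media directory changes."""
        old_media = self.media_directory[unit_id]
        self.media.setdefault(new_media, {})[unit_id] = self.media[old_media].pop(unit_id)
        self.media_directory[unit_id] = new_media

    def recall(self, key):
        """Resolve key -> relative location -> physical media, then read the bytes."""
        loc = self.secondary_directory[key]
        media_name = self.media_directory[loc.transfer_unit_id]
        unit = self.media[media_name][loc.transfer_unit_id]
        return unit[loc.offset:loc.offset + loc.length]


# Usage: migrate the two least recently used files, move the unit, then recall.
server = StorageServer()
files = {"a.txt": (10, b"alpha"), "b.txt": (30, b"bravo"), "c.txt": (20, b"charlie")}
keys = server.migrate(files, "disk", count=2)
server.relocate(1, "tape")
recalled = server.recall(keys["c.txt"])
```

Because `relocate` touches only `media_directory`, the per-file entries stay valid after a transfer unit moves, which mirrors the point that only the physical location of the transfer unit needs to be noted.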



Movable Placeholders 

The placeholders noted above are movable, in that the 
data storage management system 50 provides the capability 
to move and/or copy the placeholder files from one file 
server volume to another file server volume, even if the 
destination file server volume resides on a different file 
server. This is accomplished by the use of unique and 
immutable migration keys which are included in the place- 
holder entries to identify the location of the associated data 
file with absolute certainty. Since the system users and 
administrators can rename a storage server, the unique storage
server identifier includes timestamp data which relates the 
storage server name with a specific point in time, which 
enables the data storage management system 50 to accu- 
rately track the present identity of the storage server. 

In addition, a duplicate copy of the placeholder catalog 
file is maintained in the data storage management system 50 
to prevent loss of file system integrity in the event that the 
active placeholder catalog file is corrupted. This placeholder 
data is maintained in a placeholder volume catalog 541 
which provides users with a file-system-like view of the
placeholder files which reside on the selected file system 
volume. The placeholder volume catalogs 541 are not used 
for file access purposes, but, instead, are used to ensure that 
placeholder restoration can be accomplished in the event 
that the placeholders are corrupted and to provide the users 
with data volume content viewing. Each placeholder volume 
catalog 541 resides on the file server to which it pertains and 
tracks the placeholders for all HSM servers. In this manner, 
the integrity of the placeholders is ensured. 

Storage Server Identifier 

Once placeholders are allowed to be moved by the users 
to file server volumes that are managed by other storage 
servers, a mechanism is required that can uniquely identify 
the storage server to which the placeholder data has been
transferred or copied. This identifier is included in the
placeholder data which is scattered across numerous placeholder
files and therefore must be immutable in nature. The
name of the storage server is an inadequate identifier, since
the name can be changed over time. The identifier used in
the present data storage system comprises a combination of
the storage server name and timestamp created by the
operating system, which results in a unique identifier. The
timestamp is typically the exact time that the identifier is
created, so that the storage server identifier is uniquely
mapped to a physical entity via the timestamp data. Once
created, the identifier is never modified, since any modification
would destroy the immutability of the identifier.
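
As a rough illustration of such an identifier, the sketch below pairs the server name at creation time with a creation timestamp in an immutable record, and keeps the current (renamable) name in a separate table. The names `StorageServerId` and `StorageServerRegistry` are hypothetical, and the registry is only one assumed way of tracking renames.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass(frozen=True)
class StorageServerId:
    """Immutable identifier: server name at creation plus creation timestamp."""
    name_at_creation: str
    created_at: datetime


class StorageServerRegistry:
    """Hypothetical lookup from the immutable identifier to the current name."""

    def __init__(self):
        self._current_name = {}

    def register(self, name, now=None):
        # The timestamp is taken when the identifier is created (assumed UTC).
        sid = StorageServerId(name, now or datetime.now(timezone.utc))
        self._current_name[sid] = name
        return sid

    def rename(self, sid, new_name):
        # Renaming changes the lookup table only; the identifier is untouched.
        self._current_name[sid] = new_name

    def current_name(self, sid):
        return self._current_name[sid]
```

Placeholders would store the `StorageServerId` value, so a later rename of the server leaves every stored placeholder valid.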


Placeholder Data 

The placeholder data that the data storage management system
maintains in the extended attributes of a data file which
has been migrated comprises, for example, as shown in FIG.
2, the following data items:

Placeholder Version: the version of the placeholder data
format

Placeholder Flags: internal flags for placeholder operation

Storage Server Identifier: a pointer which identifies the
Storage Server that holds the migrated data for this
placeholder file

Transfer Unit Identifier: an identification of the transfer
unit which contains the migrated data for this placeholder
file

Location: an identification of the location within the
transfer unit that contains the migrated data for this
placeholder file

Secondary Size: the size in bytes of the data which was
migrated to secondary storage for this placeholder file

Primary Size: the size in bytes of the original data file for
this placeholder file

Modify Stamp: the date and time of the last modification
to this data file prior to the initial migration of this
version of the data file

Access Stamp: the date and time of the last access to this
data file prior to the initial migration of this version of
the data file

Migrate Stamp: the date and time of the initial migration
of this version of the data file

Recall Count: the number of times that this version of
the data file has been recalled

Verification Size: the number of bytes of the verification
data that are to be used for recall verification of this
data file

Verification Data: this data represents information which
is used to ensure that the correct data file is being
recalled
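
The data items listed above could be modeled as a single record, sketched here in Python. The field types are assumptions, since the text names the items but not their encodings. The `verify_and_count_recall` helper is likewise hypothetical: it assumes that the verification data is a copy of the leading bytes of the migrated data, which the text does not state.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PlaceholderData:
    """One placeholder record; field types are illustrative assumptions."""
    placeholder_version: int
    placeholder_flags: int
    storage_server_identifier: str
    transfer_unit_identifier: int
    location: int
    secondary_size: int
    primary_size: int
    modify_stamp: datetime
    access_stamp: datetime
    migrate_stamp: datetime
    recall_count: int
    verification_size: int
    verification_data: bytes


def verify_and_count_recall(placeholder, recalled):
    """Check recalled bytes against the placeholder and count a successful recall.

    Assumption: verification data holds the first `verification_size` bytes
    of the migrated data.
    """
    prefix = recalled[:placeholder.verification_size]
    ok = (len(recalled) == placeholder.secondary_size
          and prefix == placeholder.verification_data)
    if ok:
        placeholder.recall_count += 1
    return ok
```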

Placeholder Volume Catalogs 

For each managed file server volume in primary storage,
a placeholder volume catalog 541 is maintained that
provides a file-system-like view of the placeholder files
which are stored on the file server volume. The placeholder
volume catalog 541 is not required for data
access, since it is used solely for user viewing and
placeholder restoration. Each placeholder volume catalog
541 resides on the file server volume to which it
pertains, and is not mirrored on the storage server 50.
The placeholder volume catalog 541 tracks all placeholder
files for all file servers and therefore enables a
user to view, in hierarchical fashion, a catalog of all
data files migrated from a selected file server volume.
These data files include data files whose placeholders
have been deleted, but whose data still resides in
secondary storage, including all past migrated versions
of the data file. The placeholder volume catalog 541
therefore represents a complete history of data file
activity on the associated file server volume for all data
files still resident within the data storage subsystem.
This history attribute of the placeholder volume catalog
541 enables the data management system 50 to restore
both corrupted and deleted placeholder files.
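
A history-keeping catalog of this kind might look like the following sketch. It is an assumed, in-memory model: placeholders are represented as plain dictionaries, the volume is a dictionary from path to placeholder, and the class and method names are hypothetical.

```python
class PlaceholderVolumeCatalog:
    """Per-volume history of placeholder copies, used for viewing and restoration."""

    def __init__(self):
        self._history = {}  # path -> list of placeholder copies, oldest first

    def record_migration(self, path, placeholder):
        # Keep a copy so later damage to the live placeholder cannot reach it.
        self._history.setdefault(path, []).append(dict(placeholder))

    def listing(self):
        """File-system-like view: every path that has migrated data."""
        return sorted(self._history)

    def versions(self, path):
        """All recorded migrated versions of one file, oldest first."""
        return [dict(p) for p in self._history.get(path, [])]

    def restore(self, volume, path, version=-1):
        """Rewrite a corrupted or deleted placeholder (latest version by default)."""
        volume[path] = dict(self._history[path][version])
        return volume[path]
```

The catalog is never consulted on the normal access path in this sketch; it is read only by `listing`, `versions` and `restore`, in line with its stated role.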

Recall Initiator 

The ability to move placeholders from one file server 
volume to another file server volume makes the job of 
cleaning up orphaned placeholders from media and overflow 
transfer unit discard operations more difficult. An orphaned 
placeholder is created when a recall operation has been 
attempted on a placeholder that points to a data file that has 
been discarded from the secondary storage. The recall 
initiator process resident on the storage server agent 
121-123 scans a file server volume, removing any orphaned 
placeholders as they are encountered. As orphaned place- 
holders are encountered, the recall initiator process also 
deletes the corresponding placeholder data from the volume 
catalog. 
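
The scan described above can be sketched as a single pass over a volume. This is an illustrative assumption rather than the actual recall initiator: the volume is modeled as a dictionary from path to migration key (with `None` for files that are not placeholders), and an orphan is detected by checking the key against a set of keys still present in secondary storage.

```python
def remove_orphaned_placeholders(volume, volume_catalog, live_keys):
    """Delete placeholders whose migrated data no longer exists.

    volume:         path -> migration key, or None for a resident data file
    volume_catalog: path -> catalog copy of the placeholder data
    live_keys:      migration keys still present in secondary storage
    Returns the list of paths that were removed.
    """
    removed = []
    for path, key in list(volume.items()):
        if key is not None and key not in live_keys:
            del volume[path]                # remove the orphaned placeholder
            volume_catalog.pop(path, None)  # and its volume catalog entry
            removed.append(path)
    return removed
```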

Summary 

The data storage management system provides the capa- 
bility to move and/or copy the placeholder files from one file 
server volume to another file server volume, even if the 
destination file server volume resides on a different file 
server. This is accomplished by the use of unique and 
immutable migration keys which are included in the place- 
holder entries to identify the location of the associated data 
file with absolute certainty. In addition, a duplicate copy of 
the placeholder catalog file is maintained in the system to 
prevent loss of file system integrity in the event that the 
active placeholder catalog file is corrupted. This placeholder 
data is maintained in a placeholder volume catalog which 
provides users with a file-system-like view of the place- 
holder files which reside on the selected volume. 

We claim: 

1. A data storage management system for a data network 
which functions to interconnect a plurality of file servers, 
each of which stores data files in a plurality of file server 
volumes, said data storage management system comprising: 
directory means, located in each of said plurality of file 
servers, for identifying a storage location in said plu- 
rality of file server volumes of each data file stored on 
said file server; 
secondary storage means for storing data files migrated 

from said plurality of file servers; 
storage server means connected to said network for auto- 
matically managing transfer of data files between said 
plurality of file servers and said secondary storage 
means, comprising: 

means for migrating selected data files from said plu- 
rality of file servers to said secondary storage means, 

means for writing in said directory means at a file 
server volume directory location for each of said 
migrated selected data files, placeholder data indi- 
cating that said migrated selected data file has been 
migrated to said secondary storage means, and 
means, responsive to a one of said migrated selected
data files being transferred from a first of said
plurality of file server volumes on a selected file
server to a second of said plurality of file server
volumes on a one of said plurality of file servers, for
transferring said placeholder data, written in said 
directory means at a file server volume directory 
location for said migrated selected data file on said 
selected file server, to a said directory means at a file 
server volume directory location for said migrated 
selected data file on a one of said plurality of file 
servers associated with said second of said plurality 
of file server volumes. 

2. The system of claim 1 wherein said storage server 
means further comprises: 

means for storing data indicative of a physical data
storage location that identifies a locus in said secondary 
storage means of each of said selected data files 
migrated to said secondary storage means. 

3. The system of claim 2 further comprising: 

means, located in each of said plurality of file servers, for
intercepting a call at a selected file server to a data file 
that has been stored in said file server; 

means, responsive to said placeholder data written in said 
directory means, for recalling said requested data file 
from said secondary storage means to said file server,
comprising: 

means for reading said placeholder data stored in said 
directory means to identify a physical data storage 
location in said storing means that contains data 
which identifies a locus in said secondary storage
means of said requested migrated data file, 

means for retrieving said data stored in said identified 
physical storage location in said storing means, and 

means, responsive to said retrieved data, for transmitting
said requested migrated data file from said locus
in said secondary storage means to said selected file 
server. 



4. The system of claim 1 wherein said placeholder data, 
written by said writing means in said directory means, is 
stored as part of the data file attributes. 

5. The system of claim 1 wherein said storage server 
means further comprises: 

means, responsive to said means for writing said place- 
holder data, for writing a copy of said placeholder data 
into a volume catalog for all data files migrated from 
said file server volume to said secondary storage 
means. 

6. The system of claim 5 further comprising: 

means for enabling a user to recall said placeholder data 
stored in said volume catalog. 

7. The system of claim 2 wherein said secondary storage 
means comprises: 

a multi-layer hierarchical memory, wherein said layers in 
said hierarchical memory comprise media of differing 
characteristics. 

8. The system of claim 7 wherein said storage server 
means further comprises: 

means for collecting a plurality of data files, that are 
transmitted to said secondary storage means, into a 
transfer unit; 

means for storing said transfer unit on a first layer of said 
hierarchy. 

9. The system of claim 8 wherein said means for storing 
data comprises: 

media object directory means for storing data indicative 
of a correspondence between a transfer unit and a 
media on which said transfer unit is located. 